@thoo (Contributor) commented Oct 7, 2025

This commit introduces a comprehensive example demonstrating how to build scalable multi-agent systems using the Model Context Protocol (MCP). The architecture supports coordination between multiple MCP servers and specialized agents, designed to scale from a few agents to 100+ agents.

Key components:

  1. MCP Client Manager (mcp_client_manager.py)

    • Manages multiple MCP server connections with robust lifecycle management
    • Implements double-shielded cleanup (anyio + asyncio) to prevent cancel scope conflicts during shutdown
    • Supports concurrent cleanup with configurable timeouts
    • Uses MCPServerStreamableHttpParams for proper connection configuration
  2. Agent Factory (agent_factory.py)

    • Factory pattern for creating and configuring specialized agents at scale
    • Supports dynamic agent registration via AgentConfig
    • Creates custom tool-agents using Runner.run_streamed() with advanced configuration
    • Enables agent orchestration with hierarchical coordination
  3. MCP Servers (servers/)

    • Three example servers demonstrating different domains:
        • Math server: mathematical operations (add, multiply, power)
        • Text server: text manipulation (reverse, count words, uppercase)
        • Data server: data processing (filter, sort, aggregate)
    • Each server built with FastMCP using streamable-http transport
  4. Examples

    • multiple_mcp_servers_example.py: Complete demonstration of orchestrator coordinating specialized agents
    • adding_agents_example.py: Shows how to dynamically add new agent types to the factory
  5. Documentation

    • README.md: Quick start guide and common use cases
    • ARCHITECTURE.md: Detailed architecture guide for scaling patterns
    • servers/README.md: Server setup and usage instructions
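The dynamic registration described for the Agent Factory can be pictured with a small registry. This is only a sketch under assumptions: the `AgentConfig` fields, the `AgentFactory` method names, and the `builder` callback are hypothetical and are not taken from `agent_factory.py`; a real builder would construct an SDK agent instead of a string.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Tuple


@dataclass(frozen=True)
class AgentConfig:
    # Hypothetical fields; the AgentConfig in the PR may look different.
    name: str
    instructions: str
    mcp_servers: Tuple[str, ...] = ()


class AgentFactory:
    """Keeps agent configs by name and builds agents on demand."""

    def __init__(self, builder: Callable[[AgentConfig], Any]) -> None:
        # The builder turns a config into an agent object (assumed callback).
        self._builder = builder
        self._configs: Dict[str, AgentConfig] = {}

    def register(self, config: AgentConfig) -> None:
        # Reject duplicates so a new agent type cannot silently replace one.
        if config.name in self._configs:
            raise ValueError(f"agent type already registered: {config.name}")
        self._configs[config.name] = config

    def create(self, name: str) -> Any:
        # Raises KeyError for unknown agent types.
        return self._builder(self._configs[name])

    def registered(self) -> List[str]:
        return sorted(self._configs)


def describe(config: AgentConfig) -> str:
    # Stand-in builder that only reports which servers the agent would use.
    return f"{config.name} -> {', '.join(config.mcp_servers)}"


factory = AgentFactory(builder=describe)
factory.register(AgentConfig("math", "Solve arithmetic.", ("math_server",)))
factory.register(AgentConfig("text", "Edit text.", ("text_server",)))
```

Keeping the builder as an injected callback means new agent types only need a new `AgentConfig`, which is one plausible way to keep the factory unchanged as the number of agents grows.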

Technical highlights:

  • Streaming responses at all levels using ResponseTextDeltaEvent
  • Custom tool functions with @function_tool decorator wrapping Runner.run_streamed()
  • Proper error handling for connection failures and graceful shutdown
  • Logger suppression for harmless SDK cleanup messages
  • Reusable MCP connections across multiple agents for efficiency
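The logger suppression highlight could be approached with a standard `logging.Filter`. The sketch below is an assumption about how that might be done, not the PR's code: the logger name and the message fragments are placeholders chosen by the caller.

```python
import logging


class DropMessages(logging.Filter):
    """Drops records whose formatted message contains any given fragment."""

    def __init__(self, *fragments: str) -> None:
        super().__init__()
        self._fragments = fragments

    def filter(self, record: logging.LogRecord) -> bool:
        message = record.getMessage()
        # Returning False discards the record; everything else passes through.
        return not any(fragment in message for fragment in self._fragments)


def suppress_cleanup_noise(logger_name: str, *fragments: str) -> DropMessages:
    # A filter attached to a logger applies only to records logged on that
    # logger itself, not to records propagated from its child loggers.
    noise_filter = DropMessages(*fragments)
    logging.getLogger(logger_name).addFilter(noise_filter)
    return noise_filter
```

Filtering on message fragments rather than lowering the logger's level keeps genuine connection errors visible while hiding only the messages known to be harmless.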

This architecture enables building complex multi-agent workflows with proper separation of concerns, robust error handling, and the ability to scale to enterprise-level agent systems.

thein added 2 commits October 6, 2025 23:50

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Comment on lines +121 to +133
async def _inner():
    # AnyIO shield prevents cancel scopes from interrupting cleanup()
    with anyio.CancelScope(shield=True):
        await server.cleanup()

if timeout is None:
    # Don't allow outer-task cancellation to kill cleanup
    with suppress(asyncio.CancelledError):
        await asyncio.shield(_inner())
else:
    # Timebox *outside* the shields
    with suppress(asyncio.TimeoutError, asyncio.CancelledError):
        await asyncio.wait_for(asyncio.shield(_inner()), timeout=timeout)


[P1] Shielded cleanup uses anyio.CancelScope outside anyio context

The _shielded_cleanup helper wraps server.cleanup() in anyio.CancelScope(shield=True), but the examples that call this manager use asyncio.run and never enter an AnyIO-managed context. Instantiating CancelScope outside AnyIO raises RuntimeError: Cancel scopes can only be used from async code run by anyio, so the cleanup coroutine fails before server.cleanup() executes and the connection is left open. This contradicts the goal of robust shutdown. The shielded section should avoid AnyIO primitives or ensure cleanup is executed from an AnyIO task group.
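One of the options raised here, avoiding AnyIO primitives in the shielded section, might look like the sketch below, which relies only on `asyncio`. The helper name `shielded_cleanup`, its signature, and the boolean return value are assumptions for illustration and are not part of the PR.

```python
import asyncio
from typing import Awaitable, Callable, Optional


async def shielded_cleanup(
    cleanup: Callable[[], Awaitable[None]],
    timeout: Optional[float] = None,
) -> bool:
    """Run cleanup() so that a timeout or caller cancellation does not cancel it.

    Returns True if cleanup finished in time, False if the wait timed out.
    On timeout the cleanup task is not cancelled; it keeps running in the
    background on the same event loop.
    """
    task = asyncio.ensure_future(cleanup())
    try:
        if timeout is None:
            await asyncio.shield(task)
        else:
            # The timeout applies to the wait, not to the shielded task.
            await asyncio.wait_for(asyncio.shield(task), timeout=timeout)
        return True
    except asyncio.TimeoutError:
        return False
    # asyncio.CancelledError is deliberately not caught: the caller still
    # sees its own cancellation, while the shielded task continues.
```

Unlike the snippet under review, this version lets `CancelledError` propagate instead of suppressing it, on the assumption that the caller should still observe its own cancellation.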

